WebWise: navigating the Human Genome Project.

Author

  • K D Pruitt
Abstract

The Human Genome Project has increased the rate of DNA sequence accumulation to the point where information management has become a formidable task. The central repositories for this avalanche of data, GenBank, EMBL (European Molecular Biology Laboratory), and DDBJ (DNA Data Bank of Japan), continue to accumulate DNA sequences at an unprecedented rate. For example, the total number of nucleotides stored in the GenBank database more than doubles every 18 months (Benson et al. 1997). The scientific community is clearly interested in supporting rapid access to high-quality DNA sequence, and, although this remains controversial (Adams and Venter 1996; Bentley 1996), in supporting release of “unfinished” DNA sequence data generated by the sequencing centers. (Unfinished DNA sequences generated from a cosmid, BAC, or P1 clone may include nucleotide errors and may consist of unordered or ordered contigs with one or more gaps.) Since the process of “finishing” a sequence (which includes resolving any ambiguous bases, contig assembly, gap closure, and annotation) proceeds at a much slower pace than the initial production of sequence, a considerable amount of unfinished sequence can accumulate at the sequencing centers. Growing interest in timely dissemination of all the data, plus the perception that uneven access to the unfinished DNA sequences could confer an unfair advantage (or disadvantage) to research groups, resulted in increasing pressure on the sequencing centers and international databases to make even unfinished DNA sequence data publicly accessible.
This intent was formalized at the International Strategy Meeting on Human Genome Sequencing (held in Bermuda in February 1996), where a consortium of sequencing centers and funding agencies agreed that (1) all publicly funded human sequence data should be promptly released into the public domain, and (2) to promote coordination, sequencing centers should inform a central database of their intent to sequence a given region (“the Bermuda Principles,” Smith and Carrano 1996). GenBank, EMBL, and DDBJ have supported this agreement by forming a new functional division consisting entirely of unfinished DNA sequence data [high throughput genomic (HTG) sequence] (Ouellette and Boguski 1997); to support the coordination effort, these international databases are developing web sites to collect and display information concerning the chromosomal regions targeted for sequencing. [The Human Genome Sequence Index (HGSI) will be available from NCBI’s home page: http://www.ncbi.nlm.nih.gov/.] On an individual level, however, the sequencing centers themselves have risen to the task of making their data fully available by establishing and maintaining World Wide Web sites. In fact, these web sites have become an integral component of the ongoing sequencing effort. A sequencing center’s web site is often the best place to find an integrated overview of the mapping and sequencing progress, the DNA sequence, and future plans for a particular region of interest. The growing importance of these web sites to the sequencing community was made apparent at the recent Hilton Head meeting, where several speakers referred to their center’s web site as a further source of information (Ninth International Genome Sequencing and Analysis Conference, September 13–16, 1997, Hyatt Regency, Hilton Head, SC). Not surprisingly, the rapid increase in DNA sequence data has been matched by a proliferation of web sites to disseminate, discuss, or “link to” the actual DNA sequence information.
This plethora of web sites, and the wealth of information available there, is a powerful research tool, but one that is likely underused. Making full use of this resource can be incredibly time-consuming and confusing even for an experienced web navigator, as it is often difficult to maneuver through the maze of sites to find the relevant information. Indeed, without the URL at hand, it can be difficult even to locate a particular web site. The web has become so large that it is often challenging to phrase a useful search query to locate a particular site. For example, a recent search engine query for “Human Genome Project” and “sequencing center” yielded 43 matches, of which only 2 were to a sequencing center listed in Table 1. The initial difficulty in identifying the correct web site is compounded by the fact that the different sequencing centers have employed a variety of organizational strategies for their web sites. While variety is the hallmark of the web, the lack of any organizational standards can make it even more time-consuming to find data, as it is frequently not at all apparent how to navigate around a web site. Nor is it always obvious exactly what resources are available at a given site. This WebWise series of articles, of which this is the first, is meant to be a navigational aid for sequence sites available on the web. The WebWise series will review the Human Genome Project sequencing centers’ web sites. In addition to simply pointing the way to the different centers, this series will provide an outline of each center’s organizational strategy, discuss the type of information available there, and evaluate the general ease of use. There are many web sites that pro...


Similar resources

WebWise: guide to the University of Oklahoma's Advanced Center for Genome Technology web site.

This installment of the WebWise series reviews the University of Oklahoma Advanced Center for Genome Technology (ACGT) web site. This sequencing center is involved in sequencing several bacterial and fungal genomes in addition to their human genome sequencing effort. Although there is no intent to minimize the significance or volume of their sequencing effort in other genomes, this article does...

WebWise: guide to the Stanford Human Genome Center and the Whitehead/MIT Genome Center web sites.

This installment of the WebWise series reviews two web sites: (1) the Stanford Human Genome Center (SHGC; http://www-shgc.stanford.edu/), and (2) the Whitehead/MIT Genome Center (W/MIT; http://www-genome.wi.mit.edu/). Although the overall organization and style of these two web sites are quite different, the centers share an important feature. Both sites make available a large amount of physic...

WebWise: guide to the Joint Genome Institute web site.

This installment of the WebWise series reviews the Joint Genome Institute (JGI) web site (http://www.jgi.doe.gov/). The JGI, established in 1997 (Casey 1996), represents an ongoing consolidation of the U.S. Department of Energy (DOE) genome sequencing centers established at the Los Alamos (LANL), Lawrence Berkeley (LBNL), and Lawrence Livermore National Laboratories (LLNL). The JGI web site p...

I-49: Human Y Chromosome Proteome Project

The success of the Human Genome Project (HGP) has provided a blueprint for the approximately 20,000 gene-encoded proteins potentially active in all of the hundreds of cell types that make up the human body. Yet we still have limited knowledge about a majority of the gene-encoded proteins which are the “building blocks of life” and “cellular machinery”. It is estimated that for nearly half of th...

Welcome to WebWise

The Institute of Museum and Library Services is the primary source of federal support for the nation's 122,000 libraries and 17,500 museums. The Institute's mission is to create strong libraries and museums that connect people to information and ideas. The Institute works at the national level and in coordination with state and local organizations to sustain heritage, culture, and knowledge; en...

Journal title:
  • Genome Research

Volume 7, Issue 11

Pages -

Publication date: 1997